Max-Margin feature selection
Abstract
Many machine learning applications, such as those in vision, biology and social networking, deal with high-dimensional data. Feature selection is typically employed to select a subset of features that improves generalization accuracy and reduces the computational cost of learning the model. One of the criteria used for feature selection is to jointly minimize the redundancy and maximize the relevance of the selected features. In this paper, we formulate the task of feature selection as a one-class SVM problem in a space where features correspond to the data points and instances correspond to the dimensions. The goal is to look for a representative subset of the features (support vectors) which describes the boundary of the region where the set of the features (data points) exists. This leads to a joint optimization of relevance and redundancy in a principled max-margin framework. Additionally, our formulation enables us to leverage existing techniques for optimizing the SVM objective, resulting in highly computationally efficient solutions for the task of feature selection. Specifically, we employ the dual coordinate descent algorithm (Hsieh et al., 2008), originally proposed for SVMs, for our formulation. We use a sparse representation to deal with data in very high dimensions. Experiments on seven publicly available benchmark datasets from a variety of domains show that our approach yields solutions orders of magnitude faster while retaining the same level of accuracy as state-of-the-art feature selection techniques. © 2016 Elsevier Ltd. All rights reserved.
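The central idea of the abstract, treating each feature as a data point whose coordinates are the instances and then running dual coordinate descent, can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the paper's formulation: it solves a simplified bias-free dual, minimizing 0.5 * a'Qa - sum(a) subject to 0 <= a_j <= C with Q_jk the inner product of features j and k, and it ignores class labels, so it shows only the redundancy side of the criterion. The function names, the dict-based sparse storage, the fixed visiting order (Hsieh et al. describe random permutations) and the value of C are all hypothetical choices made for the example.

```python
def columns_as_sparse_points(X):
    """Transpose an instance-by-feature matrix into one sparse dict per feature.

    Each feature becomes a 'data point' {instance_index: value}; zeros are dropped.
    """
    n_features = len(X[0])
    points = [dict() for _ in range(n_features)]
    for i, row in enumerate(X):
        for j, value in enumerate(row):
            if value != 0:
                points[j][i] = float(value)
    return points


def dual_coordinate_descent(points, n_instances, C=10.0, max_epochs=100, tol=1e-9):
    """Coordinate descent on: min 0.5*a'Qa - sum(a), 0 <= a_j <= C, Q_jk = <f_j, f_k>.

    Simplified objective assumed for illustration only.
    """
    alpha = [0.0] * len(points)
    w = [0.0] * n_instances  # maintained as w = sum_j alpha_j * f_j
    q_diag = [sum(v * v for v in p.values()) for p in points]
    for _ in range(max_epochs):
        largest_violation = 0.0
        for j, p in enumerate(points):  # fixed order, chosen here for determinism
            if q_diag[j] == 0.0:
                continue  # all-zero feature: no curvature, skipped in this sketch
            grad = sum(w[i] * v for i, v in p.items()) - 1.0
            # Projected gradient respects the box constraint 0 <= alpha_j <= C.
            if alpha[j] <= 0.0:
                projected = min(grad, 0.0)
            elif alpha[j] >= C:
                projected = max(grad, 0.0)
            else:
                projected = grad
            largest_violation = max(largest_violation, abs(projected))
            if projected != 0.0:
                new_alpha = min(max(alpha[j] - grad / q_diag[j], 0.0), C)
                delta = new_alpha - alpha[j]
                for i, v in p.items():
                    w[i] += delta * v  # touches only the nonzero entries
                alpha[j] = new_alpha
        if largest_violation < tol:
            break
    return alpha


# Toy data: 2 instances (rows), 3 features (columns); features 0 and 1 are identical.
X = [[1, 1, 0],
     [0, 0, 1]]
alpha = dual_coordinate_descent(columns_as_sparse_points(X), n_instances=len(X))
selected = [j for j, a in enumerate(alpha) if a > 0.0]  # the "support vector" features
print(alpha, selected)
```

In this toy run the duplicated feature receives zero weight, because the first copy already accounts for it in `w`; only features 0 and 2 come out as support vectors. Because each update touches only the nonzero entries of one feature, the per-step cost scales with the sparsity of the data rather than with the number of instances.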
Similar resources
Explicit Max Margin Input Feature Selection for Nonlinear SVM using Second Order Methods
Incorporating feature selection in nonlinear SVMs leads to a large and challenging nonconvex minimization problem, which can be prone to suboptimal solutions. We use a second order optimization method that utilizes eigenvalue information and is less likely to get stuck at suboptimal solutions. We devise an alternating optimization approach to tackle the problem efficiently, breaking it down int...
Primal explicit max margin feature selection for nonlinear support vector machines
Embedding feature selection in nonlinear SVMs leads to a challenging non-convex minimization problem, which can be prone to suboptimal solutions. This paper develops an effective algorithm to directly solve the embedded feature selection primal problem. We use a trust-region method, which is better suited for non-convex optimization compared to line-search methods, and guarantees convergence to...
Bayesian Max-margin Multi-Task Learning with Data Augmentation
Both max-margin and Bayesian methods have been extensively studied in multi-task learning, but have rarely been considered together. We present Bayesian max-margin multi-task learning, which conjoins the two schools of methods, thus allowing the discriminative max-margin methods to enjoy the great flexibility of Bayesian methods on incorporating rich prior information as well as performing nonp...
Max-Margin Nonparametric Latent Feature Models for Link Prediction
Link prediction is a fundamental task in statistical network analysis. Recent advances have been made on learning flexible nonparametric Bayesian latent feature models for link prediction. In this paper, we present a max-margin learning method for such nonparametric latent feature relational models. Our approach attempts to unite the ideas of max-margin learning and Bayesian nonparametrics to d...
Margin-based feature selection for hyperspectral data
A margin-based feature selection approach is explored for hyperspectral data. This approach is based on measuring the confidence of a classifier when making predictions on test data. Greedy feature flip and iterative search algorithms, which attempt to maximise the margin-based evaluation functions, were used in the present study. Evaluation functions use linear, zero-one and sigmoid utility...
Journal title:
- Pattern Recognition Letters
Volume 95
Publication date: 2017